Allocate dQ, dK, and dV as a catted tensor to save a downstream cat in nvFuser. #59

wujingyue · 2024-03-23T04:56:51Z

The description on the added compile option explains what this optimization does.

This optimization is disabled by default for now. After #35 is merged and bookend is disabled by default, I'll try to enable it by default, or even unconditionally.
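As a rough illustration of the idea in the title (not the code in this PR): if the three gradients are written into slices of one preallocated buffer, a later concatenation of dQ, dK, and dV is already materialized and needs no extra copy. The sketch below uses only the Python standard library as a stand-in for tensors. The helper names `preallocate_qkv_grads` and `fake_backward`, the flat 1-D layout, and the fill values are all assumptions made for the example; the real executor works on GPU tensors and may concatenate along a different dimension.

```python
from array import array


def preallocate_qkv_grads(numel_q, numel_k, numel_v):
    """Allocate one flat float buffer and return it with three writable views.

    Hypothetical helper: the views share storage with the buffer, so
    anything written through them lands directly in the catted result.
    """
    buf = array("f", [0.0] * (numel_q + numel_k + numel_v))
    view = memoryview(buf)
    dq = view[:numel_q]
    dk = view[numel_q:numel_q + numel_k]
    dv = view[numel_q + numel_k:]
    return buf, dq, dk, dv


def fake_backward(dq, dk, dv):
    """Stand-in for the attention backward: fills each gradient in place."""
    dq[:] = array("f", [1.0] * len(dq))
    dk[:] = array("f", [2.0] * len(dk))
    dv[:] = array("f", [3.0] * len(dv))


catted, dq, dk, dv = preallocate_qkv_grads(2, 2, 2)
fake_backward(dq, dk, dv)

# The buffer already holds cat(dQ, dK, dV); no separate concatenation ran.
result = catted.tolist()
```

Without preallocation, a consumer that wants the concatenated gradient would have to allocate a new buffer and copy all three outputs into it, which is the downstream cat this PR aims to avoid.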

Co-authored-by: Masaki Kozuki <mkozuki@nvidia.com>

jjsjann123

👏

thunder/tests/test_cudnn_executor.py

thunder/executors/cudnnex.py

thunder/tests/test_cudnn_executor.py

t-vi

Per @jjsjann123's review. Thank you, @wujingyue and @jjsjann123.

wujingyue · 2024-03-26T01:56:24Z

Hey @vedaanta-nvidia, are you interested in reviewing this before I merge?

wujingyue · 2024-03-27T15:50:45Z

> Hey @vedaanta-nvidia, are you interested in reviewing this before I merge?

Hey Vedaanta, do you have more comments before I merge?

thunder/executors/cudnnex.py

wujingyue requested review from mruberry, lantiga, robieta and t-vi as code owners March 23, 2024 04:56

Base automatically changed from wjy/exception to main March 23, 2024 12:31

github-actions bot added the has conflicts label Mar 23, 2024

wujingyue and others added 11 commits March 23, 2024 23:35

A follow-up cleanup on #57.

bfccc97

Preallocate dQ, dK, and dV as one tensor for efficient concatenation.

04c24fd

More tests.

4d8361f

Make some functions private to the module.

12f3ea5

Merge the two ops.

61ee179

Rename preformat to preallocate.

c5d3782

Add a knob.

dd88282

Fix the test.

6a49c82

Update thunder/executors/cudnnex.py

ee7c74c

Co-authored-by: Masaki Kozuki <mkozuki@nvidia.com>

Clean up.

92e3fb3

Renaming.

f0c0e17

wujingyue force-pushed the wjy/format branch from d9ad71a to f0c0e17 Compare March 23, 2024 23:42

wujingyue requested a review from carmocca as a code owner March 23, 2024 23:42

wujingyue changed the base branch from main to wjy/clean March 23, 2024 23:42

github-actions bot removed the has conflicts label Mar 23, 2024

Base automatically changed from wjy/clean to main March 24, 2024 13:07

wujingyue added 2 commits March 24, 2024 08:36

Merge branch 'main' into wjy/format

108d906

Comment.

2e1661d

wujingyue requested review from jjsjann123 and vedaanta March 24, 2024 23:46

jjsjann123 approved these changes Mar 25, 2024

thunder/tests/test_cudnn_executor.py (resolved)

Comments and renaming.

30cb61e

wujingyue commented Mar 25, 2024

thunder/executors/cudnnex.py (resolved)

thunder/tests/test_cudnn_executor.py (resolved)

thunder/tests/test_cudnn_executor.py (resolved)

t-vi approved these changes Mar 25, 2024

wujingyue added the cudnn and enhancement labels Mar 26, 2024

vedaanta reviewed Mar 27, 2024

thunder/executors/cudnnex.py (resolved)

vedaanta approved these changes Mar 27, 2024

thunder/executors/cudnnex.py (resolved)

thunder/executors/cudnnex.py (resolved)

Merge branch 'main' into wjy/format

e625f13

wujingyue merged commit 483c352 into main Mar 27, 2024
37 checks passed

wujingyue deleted the wjy/format branch March 27, 2024 20:05
